| Statistic | Value |
|---|---|
| Number of Variables | 6 |
| Number of Rows | 404290 |
| Missing Cells | 3 |
| Missing Cells (%) | 0.0% |
| Duplicate Rows | 0 |
| Duplicate Rows (%) | 0.0% |
| Total Size in Memory | 103.9 MB |
| Average Row Size in Memory | 269.5 B |
| Variable Types | Numerical: 3, Categorical: 3 |

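
The two derived overview figures can be checked with simple arithmetic. The sketch below is a sanity check, not the profiler's own code. The helper name `overview_metrics` is hypothetical, and it assumes that "MB" means 2^20 bytes, which the report does not state.

```python
def overview_metrics(n_rows, n_cols, missing_cells, total_bytes):
    """Recompute the derived overview figures from the raw counts."""
    total_cells = n_rows * n_cols
    return {
        # Share of all cells that are missing, in percent.
        "missing_pct": 100 * missing_cells / total_cells,
        # Average in-memory footprint of one row, in bytes.
        "avg_row_bytes": total_bytes / n_rows,
    }


# Assumption: 103.9 MB is interpreted as 103.9 * 2**20 bytes.
metrics = overview_metrics(
    n_rows=404290, n_cols=6, missing_cells=3, total_bytes=103.9 * 1024**2
)
print(f"Missing cells: {metrics['missing_pct']:.1f}%")
print(f"Average row size: {metrics['avg_row_bytes']:.1f} B")
```

Under that assumption the recomputed values agree with the table after rounding. Three missing cells out of roughly 2.4 million is far below the 0.1% display precision.
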
| Insight | Type |
|---|---|
| id is uniformly distributed | Uniform |
| qid1 and qid2 have similar distributions | Similar Distribution |
| qid2 is skewed | Skewed |
| question1 has a high cardinality: 290456 distinct values | High Cardinality |
| question2 has a high cardinality: 299174 distinct values | High Cardinality |
| is_duplicate has constant length 1 | Constant Length |

id (numerical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 404290 |
| Approximate Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 6468640 |
| Mean | 202144.5 |
| Minimum | 0 |
| Maximum | 404289 |
| Zeros | 1 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |

| Quantile Statistic | Value |
|---|---|
| Minimum | 0 |
| 5-th Percentile | 20214.45 |
| Q1 | 101072.25 |
| Median | 202144.5 |
| Q3 | 303216.75 |
| 95-th Percentile | 384074.55 |
| Maximum | 404289 |
| Range | 404289 |
| IQR | 202144.5 |

| Descriptive Statistic | Value |
|---|---|
| Mean | 202144.5 |
| Standard Deviation | 116708.6145 |
| Variance | 1.3621e+10 |
| Sum | 8.1725e+10 |
| Skewness | 4.7162e-16 |
| Kurtosis | -1.2 |
| Coefficient of Variation | 0.5774 |
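
The id statistics above are what a plain row index would produce. The sketch below assumes that id takes every integer from 0 to 404289 exactly once. This is consistent with the distinct count, minimum and maximum, but the report does not state it. The helper name `consecutive_id_stats` is hypothetical, and the standard deviation uses the sample (n - 1) convention because that is the one that reproduces the reported value.

```python
import math


def consecutive_id_stats(n):
    """Closed-form summary statistics for the integers 0, 1, ..., n - 1."""
    mean = (n - 1) / 2
    # Sample variance (ddof=1) of 0..n-1 is n * (n + 1) / 12.
    std = math.sqrt(n * (n + 1) / 12)
    return {
        "mean": mean,
        "sum": n * (n - 1) // 2,
        "std": std,
        "cv": std / mean,
        # Excess kurtosis of a discrete uniform distribution on n points.
        "excess_kurtosis": -6 * (n**2 + 1) / (5 * (n**2 - 1)),
        # Percentiles under linear interpolation between order statistics.
        "quantiles": {q: q * (n - 1) for q in (0.05, 0.25, 0.5, 0.75, 0.95)},
    }


stats = consecutive_id_stats(404290)
for name, value in stats.items():
    print(name, value)
```

If the closed-form values match the tables, id carries no information beyond row order and can usually be left out of modelling.
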

qid1 (numerical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 290654 |
| Approximate Unique (%) | 71.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 6468640 |
| Mean | 217243.9424 |
| Minimum | 1 |
| Maximum | 537932 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |

| Quantile Statistic | Value |
|---|---|
| Minimum | 1 |
| 5-th Percentile | 11290.9 |
| Q1 | 74437.5 |
| Median | 192182 |
| Q3 | 346573.5 |
| 95-th Percentile | 496522.55 |
| Maximum | 537932 |
| Range | 537931 |
| IQR | 272136 |

| Descriptive Statistic | Value |
|---|---|
| Mean | 217243.9424 |
| Standard Deviation | 157751.7 |
| Variance | 2.4886e+10 |
| Sum | 8.783e+10 |
| Skewness | 0.3797 |
| Kurtosis | -1.0906 |
| Coefficient of Variation | 0.7262 |

qid2 (numerical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 299364 |
| Approximate Unique (%) | 74.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Memory Size | 6468640 |
| Mean | 220955.6553 |
| Minimum | 2 |
| Maximum | 537933 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negatives | 0 |
| Negatives (%) | 0.0% |

| Quantile Statistic | Value |
|---|---|
| Minimum | 2 |
| 5-th Percentile | 11264 |
| Q1 | 74727 |
| Median | 197052 |
| Q3 | 354692.5 |
| 95-th Percentile | 499507.2 |
| Maximum | 537933 |
| Range | 537931 |
| IQR | 279965.5 |

| Descriptive Statistic | Value |
|---|---|
| Mean | 220955.6553 |
| Standard Deviation | 159903.1826 |
| Variance | 2.5569e+10 |
| Sum | 8.933e+10 |
| Skewness | 0.3454 |
| Kurtosis | -1.1426 |
| Coefficient of Variation | 0.7237 |

question1 (categorical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 290456 |
| Approximate Unique (%) | 71.8% |
| Missing | 1 |
| Missing (%) | 0.0% |
| Memory Size | 51178824 |

| Length Statistic | Value |
|---|---|
| Mean | 59.5369 |
| Standard Deviation | 29.9405 |
| Median | 52 |
| Minimum | 1 |
| Maximum | 623 |

| Row | Sample Value |
|---|---|
| 1st row | What is the step b... |
| 2nd row | What is the story ... |
| 3rd row | How can I increase... |
| 4th row | Why am I mentally ... |
| 5th row | Which one dissolve... |

| Letter Statistic | Value |
|---|---|
| Count | 19224859 |
| Lowercase Letter | 18127812 |
| Space Separator | 4020499 |
| Uppercase Letter | 1097047 |
| Dash Punctuation | 17915 |
| Decimal Number | 164204 |
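
The row labels in the letter statistics match Unicode general category names (Ll, Zs, Lu, Pd, Nd). The sketch below shows one way such counts could be produced with Python's standard `unicodedata` module. It is an illustration under assumptions, not the profiler's actual implementation. The helper `character_stats` is hypothetical, and treating "Count" as lowercase plus uppercase letters is inferred only from the fact that those two rows sum to the reported Count.

```python
import unicodedata
from collections import Counter

# Unicode general category codes mapped to the labels used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def character_stats(texts):
    """Count characters per Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        if text is None:  # skip missing values
            continue
        counts.update(unicodedata.category(ch) for ch in text)
    stats = {label: counts[code] for code, label in LABELS.items()}
    # Assumption: "Count" is the number of lowercase plus uppercase letters.
    stats["Count"] = stats["Lowercase Letter"] + stats["Uppercase Letter"]
    return stats


print(character_stats(["What is 42?", None, "step-by-step"]))
```

Characters in other categories, such as the question mark (Po), are counted internally but not reported, which mirrors the five categories shown in the table.
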

question2 (categorical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 299174 |
| Approximate Unique (%) | 74.0% |
| Missing | 2 |
| Missing (%) | 0.0% |
| Memory Size | 51316133 |

| Length Statistic | Value |
|---|---|
| Mean | 60.1087 |
| Standard Deviation | 33.8637 |
| Median | 51 |
| Minimum | 1 |
| Maximum | 1169 |

| Row | Sample Value |
|---|---|
| 1st row | What is the step b... |
| 2nd row | What would happen ... |
| 3rd row | How can Internet s... |
| 4th row | Find the remainder... |
| 5th row | Which fish would s... |

| Letter Statistic | Value |
|---|---|
| Count | 19344492 |
| Lowercase Letter | 18227102 |
| Space Separator | 4117742 |
| Uppercase Letter | 1117390 |
| Dash Punctuation | 18365 |
| Decimal Number | 169561 |

is_duplicate (categorical)

| Statistic | Value |
|---|---|
| Approximate Distinct Count | 2 |
| Approximate Unique (%) | 0.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory Size | 26683140 |

| Length Statistic | Value |
|---|---|
| Mean | 1 |
| Standard Deviation | 0 |
| Median | 1 |
| Minimum | 1 |
| Maximum | 1 |

| Row | Sample Value |
|---|---|
| 1st row | 0 |
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |

| Letter Statistic | Value |
|---|---|
| Count | 0 |
| Lowercase Letter | 0 |
| Space Separator | 0 |
| Uppercase Letter | 0 |
| Dash Punctuation | 0 |
| Decimal Number | 404290 |